Modern virtual assistants use internal semantic parsing engines to convert user utterances into actionable commands. However, prior work has shown that semantic parsing is a difficult multilingual transfer task with low transfer efficiency compared to other tasks. In global markets such as India and Latin America, this is a critical issue, as switching between languages is prevalent for bilingual users. In this work we dramatically improve the zero-shot performance of a multilingual and code-switched semantic parsing system using two stages of multilingual alignment. First, we show that contrastive alignment pretraining improves both English performance and transfer efficiency. We then introduce a constrained optimization approach for hyperparameter-free adversarial alignment during finetuning. Our Doubly Aligned Multilingual Parser (DAMP) improves mBERT transfer performance by 3x, 6x, and 81x on the Spanglish, Hinglish, and Multilingual Task Oriented Parsing benchmarks respectively, and outperforms XLM-R and mT5-Large using 3.2x fewer parameters.
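A minimal sketch of the first stage described above, contrastive alignment pretraining on paired utterances (e.g., English sentences and their translated or code-switched counterparts), using an InfoNCE-style objective over pooled encoder embeddings. The tensor names and temperature are illustrative assumptions; this is not DAMP's exact loss or the constrained adversarial alignment used during finetuning.

```python
# Sketch of contrastive alignment pretraining on parallel utterances.
# `en_emb` / `xx_emb` are pooled embeddings of paired English and
# translated/code-switched utterances from a multilingual encoder (placeholders).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(en_emb, xx_emb, temperature=0.07):
    """InfoNCE loss pulling each utterance toward its paired counterpart
    and away from the other pairs in the batch."""
    en_emb = F.normalize(en_emb, dim=-1)
    xx_emb = F.normalize(xx_emb, dim=-1)
    logits = en_emb @ xx_emb.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(en_emb.size(0), device=en_emb.device)
    # Symmetric cross-entropy: match row-wise and column-wise.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```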
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible, and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical, and industrial teams around the world, who are pursuing applications spanning nearly every aspect of healthcare.
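A minimal sketch of the kind of workflow MONAI streamlines: dictionary-based medical-image transforms, a purpose-built network, and a segmentation loss, all composing with standard PyTorch training code. Module and argument names reflect recent MONAI releases and may differ across versions; the channel/stride choices are arbitrary examples.

```python
# Sketch of a MONAI 3D segmentation setup; exact module names and constructor
# arguments may vary between MONAI versions.
import torch
from monai.transforms import Compose, LoadImaged, EnsureChannelFirstd, ScaleIntensityd
from monai.networks.nets import UNet
from monai.losses import DiceLoss

# Dictionary-based transforms keep image/label pairs spatially consistent.
preprocess = Compose([
    LoadImaged(keys=["image", "label"]),
    EnsureChannelFirstd(keys=["image", "label"]),
    ScaleIntensityd(keys=["image"]),
])

model = UNet(
    spatial_dims=3, in_channels=1, out_channels=2,
    channels=(16, 32, 64, 128), strides=(2, 2, 2), num_res_units=2,
)
loss_fn = DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```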
In this report, we present a video-language pretraining (VLP) based solution \cite{kevin2022egovlp} to the EPIC-KITCHENS-100 Multi-Instance Retrieval (MIR) challenge. In particular, we exploit the recently released Ego4D dataset \cite{grauman2021ego4d} to pioneer egocentric VLP along three axes: the pretraining dataset, the pretraining objective, and the development set. Based on these three designs, we develop a pretrained video-language model that can transfer its egocentric video-text representations to the MIR benchmark. Furthermore, we devise an adaptive multi-instance max-margin loss to effectively fine-tune the model and equip it with a dual-softmax technique for reliable inference. Our best single model obtains strong performance on the challenge test set, with 47.39% mAP and 61.44% nDCG. The code is available at https://github.com/showlab/egovlp.
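A hedged sketch of one common form of the dual-softmax re-scoring mentioned above, applied to a video-text similarity matrix at inference time: the matrix is softmax-normalised along both axes and the two normalisations are combined, which tends to down-weight "hub" candidates that score highly against many queries. The temperature and tensor names are assumptions, not the authors' exact settings.

```python
# Sketch of dual-softmax re-scoring for cross-modal retrieval.
import torch

def dual_softmax_scores(sim, temperature=100.0):
    """sim: (num_videos, num_texts) cosine-similarity matrix."""
    prob_t_given_v = torch.softmax(sim * temperature, dim=1)  # normalise over texts
    prob_v_given_t = torch.softmax(sim * temperature, dim=0)  # normalise over videos
    return prob_t_given_v * prob_v_given_t  # element-wise re-scored matrix
```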
Automated diagnosis of malignant prostate cancer from mpMRI has been studied extensively in recent years, but model interpretation and domain drift remain the main roadblocks to clinical utilization. As an extension of our previous work, we trained a customized convolutional neural network on a public cohort of 201 patients, using cropped 2D patches around the region of interest and 2.5D slices of the prostate as input. The input configuration and the optimal model were searched in the model space using AutoKeras. The peripheral zone (PZ) and central gland (CG) were trained and tested separately, and the resulting PZ and CG detectors proved effective at highlighting the most suspicious slices in a sequence, which we hope will greatly alleviate physicians' workload.
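A minimal sketch of the kind of AutoKeras model search referred to above. The patch arrays, file names, and label encoding are hypothetical placeholders and do not reflect the authors' actual preprocessing pipeline.

```python
# Sketch of an AutoKeras image-classification search over cropped mpMRI patches.
import numpy as np
import autokeras as ak

# x: cropped patches, shape (num_patches, H, W, channels); y: lesion labels.
x_train = np.load("train_patches.npy")   # hypothetical preprocessed inputs
y_train = np.load("train_labels.npy")    # hypothetical binary labels

clf = ak.ImageClassifier(max_trials=20, overwrite=True)  # searches the model space
clf.fit(x_train, y_train, epochs=50, validation_split=0.2)
best_model = clf.export_model()          # best architecture found by the search
```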
Language planning aims to achieve complex high-level goals by decomposing them into simpler low-level steps. Such procedural reasoning ability is essential for applications such as household robots and virtual assistants. Although language planning is a basic skill for humans in everyday life, it remains a challenge for large language models (LLMs) that lack deep commonsense knowledge of the real world. Previous methods require either manual exemplars or annotated programs to elicit such ability from LLMs. In contrast, this paper proposes a Neuro-Symbolic Causal Language Planner (CLAP) that elicits procedural knowledge from LLMs with commonsense-infused prompting. Pretrained knowledge in LLMs is essentially an unobserved confounder that induces spurious correlations between tasks and action plans. Through the lens of a structural causal model (SCM), we propose an effective strategy that constructs prompts as a causal intervention on the SCM. Using graph sampling techniques and a symbolic program executor, our strategy formalizes structured causal prompts from commonsense knowledge bases. CLAP obtains state-of-the-art performance on WikiHow and RobotHow, with a relative improvement of 5.28% in human evaluations under the counterfactual setting, demonstrating the superiority of CLAP in causal language planning both semantically and sequentially.
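An illustrative sketch of the general idea of commonsense-infused prompting: sample facts related to the task from a knowledge base and prepend them to the planning prompt before querying the LLM. The toy knowledge base, sampling rule, and `llm.generate` call are all hypothetical; this is not CLAP's graph-sampling or symbolic-executor machinery.

```python
# Illustrative sketch: build a prompt prefixed with sampled commonsense facts.
import random

COMMONSENSE_KB = {  # toy (head, relation, tail) triples keyed by entity
    "coffee": [("coffee", "AtLocation", "kitchen"), ("coffee", "UsedFor", "drinking")],
    "kettle": [("kettle", "UsedFor", "boiling water"), ("kettle", "AtLocation", "stove")],
}

def build_commonsense_prompt(task, keywords, facts_per_keyword=1):
    facts = []
    for kw in keywords:
        candidates = COMMONSENSE_KB.get(kw, [])
        facts += random.sample(candidates, k=min(facts_per_keyword, len(candidates)))
    fact_text = "\n".join(f"- {h} {r} {t}" for h, r, t in facts)
    return f"Known facts:\n{fact_text}\nTask: {task}\nStep-by-step plan:"

prompt = build_commonsense_prompt("make a cup of coffee", ["coffee", "kettle"])
# plan = llm.generate(prompt)   # hypothetical LLM call producing low-level steps
```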
Semi-supervised and weakly-supervised learning have recently attracted considerable attention in the object detection literature because they can alleviate the annotation cost required to successfully train deep learning models. State-of-the-art approaches for semi-supervised learning rely on student-teacher models trained with a multi-stage process and considerable data augmentation. Custom networks have been developed for the weakly-supervised setting, making them difficult to adapt to different detectors. In this paper, a weakly semi-supervised training method is introduced that reduces these training challenges, yet achieves state-of-the-art performance by leveraging only a small fraction of fully-labeled images together with the information in weakly-labeled images. In particular, our generic sampling-based learning strategy produces pseudo-ground-truth (GT) bounding-box annotations in an online fashion, eliminating the need for multi-stage training and student-teacher network configurations. These pseudo-GT boxes are sampled from weakly-labeled images based on the categorical scores of object proposals accumulated through a score-propagation process. Empirical results on the Pascal VOC dataset indicate that the proposed approach improves performance by 5.0% when using VOC 2007 as fully-labeled and VOC 2012 as weakly-labeled data. Similarly, with 5-10% fully annotated images, we observe an improvement of more than 10% in mAP, showing that a modest investment in image-level annotation can substantially improve detection performance.
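A minimal sketch of the online pseudo-GT sampling idea: on a weakly-labeled image, proposal boxes are sampled with probability proportional to their accumulated classification score for each class known to be present. The score-propagation step itself is abstracted away, and the names, shapes, and sampling rule are simplified assumptions rather than the paper's exact procedure.

```python
# Sketch of sampling pseudo-ground-truth boxes from proposals by accumulated score.
import numpy as np

def sample_pseudo_gt(proposals, class_scores, image_labels, boxes_per_class=1):
    """proposals: (N, 4) boxes; class_scores: (N, C) accumulated scores (non-negative);
    image_labels: iterable of class indices present at image level."""
    pseudo_boxes, pseudo_labels = [], []
    for c in image_labels:
        scores = class_scores[:, c]
        probs = scores / scores.sum()        # assumes a positive total score for class c
        idx = np.random.choice(len(proposals), size=boxes_per_class, p=probs)
        pseudo_boxes.append(proposals[idx])
        pseudo_labels.extend([c] * boxes_per_class)
    return np.concatenate(pseudo_boxes, axis=0), np.array(pseudo_labels)
```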
We introduce AI2-THOR (The House Of inteRactions), a framework for visual AI research, available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor scenes in which AI agents can navigate and interact with objects to perform tasks. AI2-THOR enables research in many different domains, including but not limited to deep reinforcement learning, imitation learning, learning by interaction, planning, visual question answering, unsupervised representation learning, object detection and segmentation, and models of cognition. The goal of AI2-THOR is to facilitate building visually intelligent models and to push research forward in this domain.
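A minimal sketch of driving an agent in AI2-THOR through its Python package: load a scene, step the agent with discrete actions, and read the visual frame and object metadata from the returned event. Scene and action names follow recent ai2thor releases and may differ across versions.

```python
# Sketch of stepping an agent in AI2-THOR and inspecting the returned event.
from ai2thor.controller import Controller

controller = Controller(scene="FloorPlan1")      # load a kitchen scene
event = controller.step(action="MoveAhead")      # move the agent forward
event = controller.step(action="RotateRight")    # rotate the agent

frame = event.frame                              # RGB observation (numpy array)
visible = [obj["objectType"] for obj in event.metadata["objects"] if obj["visible"]]
print(event.metadata["lastActionSuccess"], visible[:5])

controller.stop()
```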
Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models applied in feature spaces discover domain-invariant representations, but are difficult to visualize and sometimes fail to capture pixel-level and low-level domain shifts. Recent work has shown that generative adversarial networks combined with cycle-consistency constraints are surprisingly effective at mapping images between domains, even without the use of aligned image pairs. We propose a novel discriminatively-trained Cycle-Consistent Adversarial Domain Adaptation model. CyCADA adapts representations at both the pixel level and the feature level, enforces cycle-consistency while leveraging a task loss, and does not require aligned pairs. Our model can be applied in a variety of visual recognition and prediction settings. We show new state-of-the-art results across multiple adaptation tasks, including digit classification and semantic segmentation of road scenes, demonstrating transfer from synthetic to real-world domains.
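A hedged sketch of how the three ingredients named in the abstract can be combined in one training objective: an adversarial pixel-level mapping loss, a cycle-consistency loss, and a task (semantic) loss on the translated images. The generators, discriminator, task network, and loss weights are illustrative placeholders, not CyCADA's exact architecture or hyperparameters.

```python
# Sketch of a CyCADA-style combined objective for one source batch.
import torch
import torch.nn.functional as F

def cycada_style_loss(x_s, y_s, G_st, G_ts, D_t, task_net,
                      w_gan=1.0, w_cycle=10.0, w_task=1.0):
    x_st = G_st(x_s)                         # source image translated to target style
    # Adversarial term: translated images should fool the target-domain critic.
    d_out = D_t(x_st)
    loss_gan = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    # Cycle-consistency: translating back should recover the original image.
    loss_cycle = F.l1_loss(G_ts(x_st), x_s)
    # Task (semantic) loss: translated images must still be labeled correctly.
    loss_task = F.cross_entropy(task_net(x_st), y_s)
    return w_gan * loss_gan + w_cycle * loss_cycle + w_task * loss_task
```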
Two less-addressed issues in deep reinforcement learning are (1) lack of generalization capability to new target goals, and (2) data inefficiency, i.e., the model requires several (and often costly) episodes of trial and error to converge, which makes it impractical to apply to real-world scenarios. In this paper, we address these two issues and apply our model to the task of target-driven visual navigation. To address the first issue, we propose an actor-critic model whose policy is a function of the goal as well as the current state, which allows better generalization. To address the second issue, we propose the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine. Our framework enables agents to take actions and interact with objects, so we can collect a huge number of training samples efficiently. We show that our proposed method (1) converges faster than state-of-the-art deep reinforcement learning methods, (2) generalizes across targets and across scenes, (3) generalizes to a real robot scenario with a small amount of fine-tuning (although the model is trained in simulation), and (4) is end-to-end trainable and does not need feature engineering, feature matching between frames, or 3D reconstruction of the environment. The supplementary video can be accessed at the following link: https://youtu.be/SmBxMDiOrvs.
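A minimal sketch of a goal-conditioned actor-critic head of the kind described above: the policy and value outputs are computed from a fusion of the current observation embedding and the goal embedding, which is what allows one network to generalize across targets. Layer sizes and the concatenation-based fusion are illustrative assumptions, not the paper's exact siamese architecture.

```python
# Sketch of an actor-critic whose policy is a function of both state and goal.
import torch
import torch.nn as nn

class GoalConditionedActorCritic(nn.Module):
    def __init__(self, obs_dim, goal_dim, num_actions, hidden=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(obs_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, num_actions)  # actor: action logits
        self.value_head = nn.Linear(hidden, 1)             # critic: state value

    def forward(self, obs_emb, goal_emb):
        h = self.fuse(torch.cat([obs_emb, goal_emb], dim=-1))
        return self.policy_head(h), self.value_head(h)
```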
Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity and limited memory requirements, and comes with non-asymptotic bounds, convergence results, and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao--Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
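A hedged sketch of the basic SMC machinery that PaRIS and PPG build on: a bootstrap particle filter on a toy linear-Gaussian state-space model, with a "poor man's smoother" that carries the additive statistic sum over t of h(x_{t-1}, x_t) along particle ancestries and returns a self-normalised estimate of its smoothed expectation. PaRIS replaces this genealogy tracing with O(N) backward sampling and PPG further wraps the procedure in conditional SMC moves; neither refinement is shown here, and the model parameters and the choice of h are illustrative.

```python
# Sketch: bootstrap particle filter + genealogy-traced additive smoothing on a
# toy linear-Gaussian model x_t = phi*x_{t-1} + noise, y_t = x_t + noise.
import numpy as np

rng = np.random.default_rng(0)
phi, sigma_x, sigma_y, N, T = 0.9, 1.0, 1.0, 500, 100

# Simulate synthetic data from the model.
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = phi * x_true[t - 1] + sigma_x * rng.standard_normal()
y = x_true + sigma_y * rng.standard_normal(T)

def h(x_prev, x_curr):              # additive statistic of interest
    return x_prev * x_curr

particles = rng.standard_normal(N)  # time-0 particles
stats = np.zeros(N)                 # running additive statistic per particle
for t in range(1, T):
    # Weight the time t-1 particles by the likelihood of y[t-1], then resample.
    logw = -0.5 * (y[t - 1] - particles) ** 2 / sigma_y**2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)
    parents, stats = particles[idx], stats[idx]
    # Propagate through the dynamics and update the additive statistic.
    particles = phi * parents + sigma_x * rng.standard_normal(N)
    stats = stats + h(parents, particles)

# Self-normalised estimate of E[sum_t x_{t-1} x_t | y_{0:T-1}].
logw = -0.5 * (y[T - 1] - particles) ** 2 / sigma_y**2
w = np.exp(logw - logw.max())
w /= w.sum()
print("smoothed additive functional estimate:", float(np.sum(w * stats)))
```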